The equivalence of logistic regression and maximum entropy models

Author

  • John Mount
Abstract

As our colleague so aptly demonstrated ( http://www.win-vector.com/blog/2011/09/the-simplerderivation-of-logistic-regression/ ), there is one derivation of logistic regression that is particularly beautiful. It is not as general as the one found in Agresti [Agresti, 1990] (which deals with generalized linear models in their full generality), but it gets to the important balance equations very quickly. We will pursue this further to re-derive multi-category logistic regression in both its standard (sigmoid) phrasing and its equivalent maximum entropy clothing. It is well known that logistic regression and maximum entropy modeling are equivalent (for example, see [Klein and Manning, 2003]), but we will show that the simpler derivation already given is a very good way to demonstrate the equivalence (and it points out that logistic regression is actually special, not just one of many equivalent GLMs).

1 Overview

We will proceed as follows:

1. This outline.
2. Introduce a simplified machine learning problem and some notation.
3. Re-discover logistic regression by a simplified version of the standard derivations.
4. Re-invent logistic regression by using the maximum entropy method.
5. Draw some conclusions.

2 Notation

Suppose our machine learning input is a sequence of real vectors of dimension n (where we have already applied the usual machine learning preprocessing tricks of converting categorical variables into indicators over levels and adding a constant variable to our representation). Our notation will be as follows:

1. x(1), ..., x(m) will denote our input data. Each x(i) is a vector in R^n. We will use the function-of-i notation to denote which specific example we are working with. We will also use the variable j to denote which of the n coordinates or parameters we are interested in (as in x(i)_j).

∗email: mailto:[email protected] web: http://www.win-vector.com/
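
As an illustration of the balance equations mentioned in the abstract, here is a minimal sketch. It is not the author's derivation or code. It assumes synthetic data, a plain gradient-ascent fit, and hypothetical names (`softmax`, `fit_softmax`, `max_gap`). It checks a standard property: at the maximum-likelihood fit of multi-category logistic regression, the feature totals the model expects for each category match the totals observed in the data. These are the same constraints that the maximum entropy formulation imposes.

```python
import numpy as np

def softmax(scores):
    # Row-wise softmax; subtracting the row maximum avoids overflow in exp.
    shifted = scores - scores.max(axis=1, keepdims=True)
    expd = np.exp(shifted)
    return expd / expd.sum(axis=1, keepdims=True)

def fit_softmax(X, Y, steps=20000, lr=1.0):
    # Gradient ascent on the mean log-likelihood (a simple stand-in solver,
    # chosen for brevity rather than speed).
    m, n = X.shape
    W = np.zeros((n, Y.shape[1]))
    for _ in range(steps):
        P = softmax(X @ W)
        W += lr * (X.T @ (Y - P)) / m  # gradient is X^T (Y - P) / m
    return W

# Synthetic data (an assumption for illustration): m examples in R^n,
# where the last coordinate is the constant variable.
rng = np.random.default_rng(0)
m, n, k = 200, 3, 3
X = np.hstack([rng.normal(size=(m, n - 1)), np.ones((m, 1))])
true_W = rng.normal(size=(n, k))
probs = softmax(X @ true_W)
labels = np.array([rng.choice(k, p=p) for p in probs])
Y = np.eye(k)[labels]  # one-hot indicators of the observed category

W = fit_softmax(X, Y)
P = softmax(X @ W)

# Balance equations: for every coordinate j and category c,
#   sum_i x(i)_j * P(c | x(i))  ==  sum_i x(i)_j * [y(i) == c]
model_side = X.T @ P
data_side = X.T @ Y
max_gap = np.abs(model_side - data_side).max()
print("largest balance-equation gap:", max_gap)
```

Because the last coordinate is constant, the final row of `model_side` is the model's expected count for each category, and the balance equations force it to agree with the observed category counts.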

Similar resources

A Microeconomic Interpretation of the Maximum Entropy Estimator of Multinomial Logit Models and Its Equivalence to the Maximum Likelihood Estimator

Maximum entropy models are often used to describe supply and demand behavior in urban transportation and land use systems. However, they have been criticized for not representing behavioral rules of system agents and because their parameters seem to adjust only to modeler-imposed constraints. In response, it is demonstrated that the solution to the entropy maximization problem with linear cons...

Statistical Models for Presence-Only Data: Finite-Sample Equivalence and Addressing Observer Bias

Statistical modeling of presence-only data has attracted much recent attention in the ecological literature, leading to a proliferation of methods, including the inhomogeneous Poisson process (IPP) model [15], maximum entropy (Maxent) modeling of species distributions [12] [9] [10], and logistic regression models. Several recent articles have shown the close relationships between these methods ...

Maximum entropy, logistic regression, and species abundance

There is considerable debate about the utility of statistical mechanics in predicting diversity patterns in terms of life history traits. Here, I reflect on this debate and show that a community is controlled by the balance of two opposite forces: the entropic part (the natural tendency of the system to be in the configuration with the highest possible entropy) and environmental, ecological and...

A Note on the Bivariate Maximum Entropy Modeling

Let X = (X1, X2) be a continuous random vector. Under the assumption that the marginal distributions of X1 and X2 are given, we develop models for the vector X when there is partial information about the dependence structure between X1 and X2. The models, which are obtained based on the well-known Principle of Maximum Entropy, are called the maximum entropy (ME) mo...

Discrete Choice and Rational Inattention: a General Equivalence Result

This paper establishes a general equivalence between discrete choice and rational inattention models. Matejka and McKay (2015, AER) showed that when information costs are modelled using the Shannon entropy function, the resulting choice probabilities in the rational inattention model take the multinomial logit form. By exploiting convex-analytic properties of the discrete choice model, we show ...

Publication date: 2012